AITopics

Country:

North America > Haiti (0.14)
Asia > Philippines > Luzon > Ilocos Region > Province of Pangasinan (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(38 more...)

Genre: Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Neural Information Processing SystemsFeb-12-2026, 16:51:10 GMT

The Cluster Description Problem - Complexity Results, Formulations and Approximations

Ian Davidson, Antoine Gourru, S Ravi

Neural Information Processing Systems http://nips.cc/

constraint, descriptor, formulation, (12 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Virginia (0.04)
North America > United States > Texas (0.04)
(10 more...)

Genre: Research Report (0.46)

Industry: Information Technology > Services (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Data Science > Data Mining (0.88)

Neural Information Processing SystemsNov-20-2025, 15:57:46 GMT

The Cluster Description Problem - Complexity Results, Formulations and Approximations

Ian Davidson, Antoine Gourru, S Ravi

There are many clustering algorithms which perform well towards their aim of finding cohesive groups of instances.

data mining, descriptor, machine learning, (17 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Virginia (0.04)
North America > United States > Texas (0.04)
(10 more...)

Genre: Research Report (0.46)

Industry:

Information Technology > Services (0.47)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Data Science > Data Mining (0.88)

Neural Information Processing SystemsOct-8-2025, 22:13:22 GMT

74bb24dca8334adce292883b4b651eda-Paper-Conference.pdf

large language model, machine learning, natural language, (17 more...)

Country:

North America > Haiti (0.14)
Asia > Philippines > Luzon > Ilocos Region > Province of Pangasinan (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(38 more...)

Genre: Overview (0.46)

Technology:

Information Technology > Communications (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
(2 more...)

arXiv.org Artificial IntelligenceMar-28-2025

Using AI to Summarize US Presidential Campaign TV Advertisement Videos, 1952-2012

Breuer, Adam, Dietrich, Bryce J., Crespin, Michael H., Butler, Matthew, Pyrse, J. A., Imai, Kosuke

This paper introduces the largest and most comprehensive dataset of US presidential campaign television advertisements, available in digital format. The dataset also includes machine-searchable transcripts and high-quality summaries designed to facilitate a variety of academic research. To date, there has been great interest in collecting and analyzing US presidential campaign advertisements, but the need for manual procurement and annotation led many to rely on smaller subsets. We design a large-scale parallelized, AI-based analysis pipeline that automates the laborious process of preparing, transcribing, and summarizing videos. We then apply this methodology to the 9,707 presidential ads from the Julian P. Kanter Political Commercial Archive. We conduct extensive human evaluations to show that these transcripts and summaries match the quality of manually generated alternatives. We illustrate the value of this data by including an application that tracks the genesis and evolution of current focal issue areas over seven decades of presidential elections. Our analysis pipeline and codebase also show how to use LLM-based tools to obtain high-quality summaries for other video datasets.

large language model, machine learning, natural language, (22 more...)

2503.22589

Country:

North America > United States > Arkansas (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Wisconsin (0.04)
(15 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Marketing (1.00)
Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Information Management > Search (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)

Viet-An Nguyen, Jordan L. Ying, Philip Resnik, Jonathan Chang

Learning a Concept Hierarchy from Multi-labeled Documents

Neural Information Processing SystemsFeb-9-2025, 04:41:54 GMT

While topic models can discover patterns of word usage in large corpora, it is difficult to meld this unsupervised structure with noisy, human-provided labels, especially when the label space is large.

data mining, machine learning, natural language, (19 more...)

Country:

Asia > North Korea (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Colorado > Boulder County > Boulder (0.14)
(22 more...)

Industry:

Government > Military (1.00)
Law (0.94)
Government > Regional Government > North America Government > United States Government (0.93)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.50)
(2 more...)

arXiv.org Artificial IntelligenceDec-16-2024

Interpretable LLM-based Table Question Answering

Giang, null, Nguyen, null, Brugere, Ivan, Sharma, Shubham, Kariyappa, Sanjay, Nguyen, Anh Totti, Lecue, Freddy

Interpretability for Table Question Answering (Table QA) is critical, particularly in high-stakes industries like finance or healthcare. Although recent approaches using Large Language Models (LLMs) have significantly improved Table QA performance, their explanations for how the answers are generated are ambiguous. To fill this gap, we introduce Plan-of-SQLs ( or POS), an interpretable, effective, and efficient approach to Table QA that answers an input query solely with SQL executions. Through qualitative and quantitative evaluations with human and LLM judges, we show that POS is most preferred among explanation methods, helps human users understand model decision boundaries, and facilitates model success and error identification. Furthermore, when evaluated in standard benchmarks (TabFact, WikiTQ, and FetaQA), POS achieves competitive or superior accuracy compared to existing methods, while maintaining greater efficiency by requiring significantly fewer LLM calls and database queries.

explanation, large language model, machine learning, (22 more...)

2412.12386

Country:

North America > United States > Iowa (0.07)
North America > United States > Michigan (0.05)
North America > United States > Tennessee (0.05)
(39 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (1.00)
Banking & Finance (1.00)
Leisure & Entertainment > Sports > Tennis (0.46)
Leisure & Entertainment > Sports > Golf (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Sharma, Kartik, Kumar, Peeyush, Li, Yunqing

OG-RAG: Ontology-Grounded Retrieval-Augmented Generation For Large Language Models

arXiv.org Artificial IntelligenceDec-11-2024

This paper presents OG-RAG, an Ontology-Grounded Retrieval Augmented Generation method designed to enhance LLM-generated responses by anchoring retrieval processes in domain-specific ontologies. While LLMs are widely used for tasks like question answering and search, they struggle to adapt to specialized knowledge, such as industrial workflows or knowledge work, without expensive fine-tuning or sub-optimal retrieval methods. Existing retrieval-augmented models, such as RAG, offer improvements but fail to account for structured domain knowledge, leading to suboptimal context generation. Ontologies, which conceptually organize domain knowledge by defining entities and their interrelationships, offer a structured representation to address this gap. OG-RAG constructs a hypergraph representation of domain documents, where each hyperedge encapsulates clusters of factual knowledge grounded using domain-specific ontology. An optimization algorithm then retrieves the minimal set of hyperedges that constructs a precise, conceptually grounded context for the LLM. This method enables efficient retrieval while preserving the complex relationships between entities. OG-RAG applies to domains where fact-based reasoning is essential, particularly in tasks that require workflows or decision-making steps to follow predefined rules and procedures. These include industrial workflows in healthcare, legal, and agricultural sectors, as well as knowledge-driven tasks such as news journalism, investigative research, consulting and more. Our evaluations demonstrate that OG-RAG increases the recall of accurate facts by 55% and improves response correctness by 40% across four different LLMs. Additionally, OG-RAG enables 30% faster attribution of responses to context and boosts fact-based reasoning accuracy by 27% compared to baseline methods.

large language model, machine learning, natural language, (15 more...)

2412.15235

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.04)
Asia > India > Madhya Pradesh > Bhopal (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(2 more...)

Genre:

Workflow (0.89)
Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Materials > Chemicals > Agricultural Chemicals (0.96)
Food & Agriculture > Agriculture > Pest Control (0.71)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mashaabi, Malak, Al-Khalifa, Shahad, Al-Khalifa, Hend

A Survey of Large Language Models for Arabic Language and its Dialects

arXiv.org Artificial IntelligenceOct-26-2024

This survey offers a comprehensive overview of Large Language Models (LLMs) designed for Arabic language and its dialects. It covers key architectures, including encoder-only, decoder-only, and encoder-decoder models, along with the datasets used for pre-training, spanning Classical Arabic, Modern Standard Arabic, and Dialectal Arabic. The study also explores monolingual, bilingual, and multilingual LLMs, analyzing their architectures and performance across downstream tasks, such as sentiment analysis, named entity recognition, and question answering. Furthermore, it assesses the openness of Arabic LLMs based on factors, such as source code availability, training data, model weights, and documentation. The survey highlights the need for more diverse dialectal datasets and attributes the importance of openness for research reproducibility and transparency. It concludes by identifying key challenges and opportunities for future research and stressing the need for more inclusive and representative models.

arabic, large language model, machine learning, (18 more...)

2410.20238

Country:

Africa > Middle East > Morocco (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(42 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.47)

Industry:

Education (1.00)
Media > News (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceOct-24-2024

End-to-end Training for Recommendation with Language-based User Profiles

Gao, Zhaolin, Zhou, Joyce, Dai, Yijia, Joachims, Thorsten

Many online platforms maintain user profiles for personalization. Unfortunately, these profiles are typically not interpretable or easily modifiable by the user. To remedy this shortcoming, we explore natural language-based user profiles, as they promise enhanced transparency and scrutability of recommender systems. While existing work has shown that language-based profiles from standard LLMs can be effective, such generalist LLMs are unlikely to be optimal for this task. In this paper, we introduce LangPTune, the first end-to-end learning method for training LLMs to produce language-based user profiles that optimize recommendation effectiveness. Through comprehensive evaluations of LangPTune across various training configurations and benchmarks, we demonstrate that our approach significantly outperforms existing profile-based methods. In addition, it approaches performance levels comparable to state-of-the-art, less transparent recommender systems, providing a robust and interpretable alternative to conventional systems. Finally, we validate the relative interpretability of these language-based user profiles through user studies involving crowdworkers and GPT-4-based evaluations. Implementation of LangPTune can be found at https://github.com/ZhaolinGao/LangPTune.

large language model, machine learning, natural language, (19 more...)